Despite their apparent simplicity, random Boolean networks display a rich variety of dynamical
behaviors. Much work has been focused on the properties and abundance of attractors. The 
topologies of random Boolean networks with one input per node can be seen as graphs of random 
maps. We introduce an approach to investigating random maps and finding analytical results for 
attractors in random Boolean networks with the corresponding topology. Approximating some 
other non-chaotic networks to be of this class, we apply the analytic results to them. For this 
approximation, we observe a strikingly good agreement on the numbers of attractors of various 
lengths. We also investigate observables related to the average number of attractors in relation to 
the typical number of attractors. Here, we find strong differences that highlight the difficulties in 
making direct comparisons between random Boolean networks and real systems. Furthermore, we 
demonstrate the power of our approach by deriving some results for random maps. These results 
include the distribution of the number of components in random maps, along with asymptotic 
expansions for cumulants up to the 4th order. 

PACS numbers: 89.75.Hc, 02.10.Ox 



INTRODUCTION 

Random Boolean networks have long enjoyed the at- 
tention of researchers, both in their own right and as 
simplistic models, in particular for gene regulatory net- 
works. The properties of these networks have been stud- 
ied for a variety of network architectures, distributions of 
Boolean rules, and even for different updating strategies. 
The simplest and most commonly used strategy is to
synchronously update all nodes. Networks of this kind have
been investigated extensively; see, e.g., [?].

The networks we consider are, generally speaking, those
in which the inputs to each node are chosen randomly
with equal probability among all nodes, and where the
Boolean rules of the nodes are picked randomly and in- 
dependently from some distribution. In other words, re- 
alizing a network of N nodes consists of three steps to be 
performed for each node: (a) choose the number of
inputs, called in-degree or connectivity, and here denoted
$K_{\rm in}$, (b) choose a Boolean function of $K_{\rm in}$ inputs to be
the rule of the node, and (c) choose $K_{\rm in}$ nodes that will
serve as the inputs to the rule. These steps must be done
in the same way for all nodes, and be independent be- 
tween nodes. Additionally, though step (c) may be done 
with or without replacement, it must give equal proba- 
bility to all nodes, implying that the out-degree of each 
node is drawn from a Poisson distribution. 

The network dynamics under consideration is given by 
synchronous updating of the nodes. At any given time 
step t, each node has a state of true or false. The state 
of any node at time t + 1 is that which its Boolean rule 
produces when applied to the states of the input nodes at 
time t. Consequently, the entire network state is updated 
deterministically, and any trajectory in state space will
eventually become periodic. Thus, the state space
consists of attractor basins and attractors of varying length,
and it always has at least one attractor.
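
The construction and the synchronous update rule described above can be sketched in Python for the single-input case that the rest of the paper focuses on. This is only an illustration, not the authors' code: the function names, the string labels for the four one-input rules, the even split between the two constant rules, and the brute-force enumeration over all $2^N$ states (feasible only for small N) are assumptions made for the example.

```python
import itertools
import random

COPY, INVERT, TRUE, FALSE = "copy", "invert", "true", "false"

def random_one_input_network(n, r_copy, r_invert, rng):
    """Draw one input and one rule per node (hypothetical helper).

    r_copy and r_invert are the probabilities of copy and inverter rules;
    the remaining probability is split evenly between the constant rules
    (an assumption; the split does not affect the attractor structure).
    """
    inputs = [rng.randrange(n) for _ in range(n)]
    rules = []
    for _ in range(n):
        u = rng.random()
        if u < r_copy:
            rules.append(COPY)
        elif u < r_copy + r_invert:
            rules.append(INVERT)
        else:
            rules.append(rng.choice([TRUE, FALSE]))
    return inputs, rules

def step(state, inputs, rules):
    """Synchronously update all nodes once."""
    new_state = []
    for source, rule in zip(inputs, rules):
        if rule == COPY:
            new_state.append(state[source])
        elif rule == INVERT:
            new_state.append(not state[source])
        else:
            new_state.append(rule == TRUE)
    return tuple(new_state)

def attractor_lengths(inputs, rules):
    """Return the sorted lengths of all attractors, by brute force."""
    n = len(inputs)
    found = set()  # one representative state per attractor
    lengths = []
    for start in itertools.product([False, True], repeat=n):
        first_visit = {}
        state, t = start, 0
        while state not in first_visit:
            first_visit[state] = t
            state = step(state, inputs, rules)
            t += 1
        # 'state' now lies on the attractor reached from 'start'.
        cycle = [state]
        nxt = step(state, inputs, rules)
        while nxt != state:
            cycle.append(nxt)
            nxt = step(nxt, inputs, rules)
        representative = min(cycle)
        if representative not in found:
            found.add(representative)
            lengths.append(len(cycle))
    return sorted(lengths)
```

For two nodes that copy each other, `attractor_lengths([1, 0], [COPY, COPY])` finds two fixed points and one 2-cycle, while a single self-inverting node has one 2-cycle and no fixed point.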

In this work we determine analytically the numbers 
of attractors of different lengths in networks with
connectivity (in-degree) one. We compare these results to
networks of higher connectivity and find a remarkable
degree of agreement, meaning that networks of single-input
nodes can be employed to approximate more complicated
networks, even for small systems. For large networks, a
reasonable level of correspondence is expected. See [?]
on effective connectivity for critical networks, and [?] on
the limiting numbers of cycles in subcritical networks.

Random Boolean networks with connectivity one have 
been investigated analytically in earlier work [?]. In
those papers, a graph-theoretic approach was employed.
The approach in [?] starts with a derivation that also
is directly applicable to random maps. For a random
Boolean network with connectivity one, a random map 
can be formed from the network topology. Every node 
has a rule that takes its input from a randomly chosen 
node. The operation of finding the input node to a given 
node forms a map from the set of nodes into itself. This 
map satisfies the properties of a random map. 
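
As a small illustration of this correspondence, the sketch below treats the list of input nodes as a map from the node set into itself and extracts the loops of its functional graph. The function name and the list representation are assumptions made for the example. Since every connected component of such a graph contains exactly one loop, the number of loops equals the number of components.

```python
def loop_lengths(inputs):
    """Sorted lengths of all loops in the graph node -> inputs[node]."""
    n = len(inputs)
    status = [0] * n  # 0: unvisited, 1: on the current path, 2: finished
    lengths = []
    for start in range(n):
        path = []
        node = start
        # Follow the map until we hit a node that was seen before.
        while status[node] == 0:
            status[node] = 1
            path.append(node)
            node = inputs[node]
        if status[node] == 1:
            # The walk closed on itself: a new loop starts at 'node'.
            lengths.append(len(path) - path.index(node))
        for visited in path:
            status[visited] = 2
    return sorted(lengths)

def number_of_components(inputs):
    """Each component of a functional graph has exactly one loop."""
    return len(loop_lengths(inputs))
```

For example, `loop_lengths([1, 2, 0, 0, 3])` returns `[3]`: nodes 0, 1, and 2 form a loop, while nodes 3 and 4 sit in a tree attached to it.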

For highly chaotic networks, with many inputs per 
node, the state space can be compared to a random map. 
Networks where every state is randomly mapped to a
successor state are investigated in [12].

In [?], only attractors with large attractor basins are
considered, and the main results are on the distribution
of attractor basin sizes. We extend these calculations and
are able to also consider attractors with small attractor
basins, and include these in the observables we
investigate. Ref. [?] focuses on proving superpolynomial scaling,
with system size, in the average number of attractors, as
well as in the average attractor length, for critical
networks with in-degree one. Our calculations reveal more
details for cycles of specific lengths.

For long cycles, especially in large networks, there are 
some artefacts that make comparisons to real networks 
difficult. For example, the integer divisibility of the cycle 
length is important; see, e.g., [?]. Also, the
total number of attractors grows superpolynomially with
system size in critical networks [?], and most of the
attractors have tiny attractor basins as compared to the
full state space [?]. In this work such artefacts
become particularly apparent, and we think that long 
cycles are hard to connect to real dynamical systems. 

On the other hand, comparisons to real dynamical
systems still seem to be relevant with regard to fixed points
and some stability properties [?]. An interesting way
to make more realistic comparisons regarding cycles is to
consider those attractors that are stable with respect to
repeated infinitesimal changes in the timing of updating
events [?].

Our approach provides a convenient starting point for 
investigations of random maps in general. Random maps 
have been the subject of extensive studies; see, e.g.,
[?], and [?] for a book
that includes this subject. For networks with in-degree
one, our approach enables analytical investigations of far
more observables than have been analytically accessible
with previously presented methods. This could provide a
starting point for understanding more complicated
networks, and a tool for seeking observables that may reveal
interesting properties in comparisons to real systems. 

Several results on random maps can be obtained in 
a straightforward manner from our approach. One key 
property of a random map is the number of components 
in the functional graph, i.e., the number of separated 
islands in the corresponding network. We rederive a rel- 
atively simple expression for the distribution of the num- 
ber of components, along with asymptotic expansions for 
cumulants up to the 4th order. To a large extent, the 
asymptotic results are new. 

In the results section, we show some numerical compar- 
isons between random Boolean networks of multi-input 
nodes and networks with connectivity one. The results 
show similarities that are stronger than we had expected. 
In future research, it is possible that the connection be- 
tween networks with single and multiple inputs per node 
could be better understood by combining our approach 
with results and ideas from [?]. In [?], the connected
Boolean networks consisting of one two-input node and
an arbitrary number of single-input nodes are
investigated. Although there are difficulties in comparing
attractor properties directly with real dynamical systems, a
satisfactory explanation of the similarities between these 
networks, with single vs. multiple inputs per node, may 
provide keys to the understanding of dynamics in net- 
works in general. 



THEORY 

In a network with only one input per node, the network 
topology can be described as a set of loops with trees of 
nodes connected to them. To understand the distribu- 
tion of attractors of different lengths, it is sufficient to 
consider the loops. All nodes outside the loops will after 
a short transient time act as slaves to the nodes in the 
loops. Also, the nodes in a loop that contains at least 
one constant rule, will reach a fixed final state after a 
short time. 

All nodes that are relevant to the attractor structure
are contained in loops with only non-constant
(information-conserving) rules. In other words, all the
relevant elements, as described in [?], are contained in
such loops. We let $\mu$ denote the number of
information-conserving loops and let $\hat\mu$ denote the number of nodes
in such loops.

We divide the calculations of the wanted observables 
into two steps. First, we present general considerations 
for loop-dependent observables. Then, we apply the
general results to investigate observables connected to the
attractor structure. Before the second step, we derive
expressions for the distributions of $\mu$ and $\hat\mu$, together with
asymptotic expansions for corresponding means and
variances, to illustrate the meaning and power of the general
expressions. 

Basic Network Properties 

Throughout this paper, N denotes the number of nodes 
in the network, and L the length of an attractor, be it a 
cycle (L > 1) or a fixed point (L = 1). For brevity we 
use the term L-cycle, and understand this to mean an at- 
tractor such that taking L time steps forward produces 
the initial state. When L is the smallest positive integer 
fulfilling this, we speak of a proper L-cycle. We denote
the number of proper L-cycles, for a given network
realization, by $C_L$. The arithmetic mean over realizations of
networks of a size N is denoted by $\langle C_L\rangle_N$, so the mean
number of network states that are part of a proper
L-cycle is $L\langle C_L\rangle_N$.

Related to $C_L$ is $\Omega_L$, the number of states that
reappear after L time steps and hence are part of any L-cycle,
proper or not. Analogous to $\langle C_L\rangle_N$, we let $\langle\Omega_L\rangle_N$ denote
the average of $\Omega_L$ for networks with N nodes. If $\langle\Omega_L\rangle_N$
is known for all L, $\langle C_L\rangle_N$ can be calculated from the
set-theoretic principle of inclusion-exclusion. See Supporting
Text to [?].
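
One common way to carry out such an inclusion-exclusion step is Möbius inversion over the divisors of L. Every state counted in $\Omega_L$ lies on a proper d-cycle for exactly one divisor d of L, so $\Omega_L = \sum_{d|L} d\,C_d$, which inverts to $C_L = \frac{1}{L}\sum_{d|L}\mu_{\rm M}(L/d)\,\Omega_d$, where $\mu_{\rm M}$ is the Möbius function (written with a subscript here to avoid confusion with the loop count $\mu$). The sketch below assumes this formulation; it is not taken from the Supporting Text cited above, and the function names are hypothetical. The relation is linear, so it applies to averages over networks as well as to a single realization.

```python
def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def mobius(n):
    """Moebius function, computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def proper_cycles(omega, L):
    """C_L from a mapping omega[d] = Omega_d for every divisor d of L."""
    total = sum(mobius(L // d) * omega[d] for d in divisors(L))
    return total / L
```

For a single ring of six copy operators, $\Omega_d = 2^{\gcd(6,d)}$, and `proper_cycles({1: 2, 2: 4, 3: 8, 6: 64}, 6)` gives 9 proper 6-cycles.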

For large N, the value of $\langle C_L\rangle_N$ is often misleading, in
the sense that some rarely occurring networks with
extremely many attractors dominate the average. To better
understand this phenomenon, we introduce the
observables $R_N^L$ and $\langle\Omega_L\rangle_N^G$. $R_N^L$ denotes the probability that
$\Omega_L \neq 0$ for a random network of N nodes, and $\langle\Omega_L\rangle_N^G$
is the geometric mean of $\Omega_L$ for N-node networks with
$\Omega_L \neq 0$.

In the case that every node has one input, the
quantities $\langle\Omega_L\rangle_N$, $R_N^L$, and $\langle\Omega_L\rangle_N^G$ can be calculated
analytically for any N. In the one-input case, the large-N limit
of $\langle\Omega_L\rangle_N$, $\langle\Omega_L\rangle_\infty$, is identical to the corresponding limit
for subcritical networks of multi-input nodes, as derived
in [?]. Furthermore, we discuss in [?] to what extent
critical networks of multi-input nodes are expected to show
similarities to networks of single-input nodes.

For random Boolean networks of one-input nodes, 
there are only two relevant control parameters in the 
model description, apart from the system size N. There 
are four possible Boolean rules with one input. These are 
the constant rules, true and false, together with the
information-conserving rules that either copy or invert
the input. The distribution of true vs. false is
irrelevant for the attractor structure of the network. Hence,
the relevant control parameters are the probabilities of
selecting inverters and copy operators when a rule is
randomly chosen. We let $r_{\rm C}$ and $r_{\rm I}$ denote the selection
probabilities associated with copy operators and
inverters, respectively.

In networks with one-input nodes, the total probability
of selecting an information-conserving rule is $r = r_{\rm C} + r_{\rm I}$.
In analogy with the definition of r, we also define
$\Delta r = r_{\rm C} - r_{\rm I}$. In most cases it is more convenient to work with
r and $\Delta r$ than with $r_{\rm C}$ and $r_{\rm I}$. The quantities r and $\Delta r$
can also be seen as measures of how a network responds
to a small perturbation. From this viewpoint, r and $\Delta r$
are average growth factors for a random perturbation
during one time step. For r, the size of the perturbation is
measured with the Hamming distance to an unperturbed
network. For $\Delta r$, the Hamming distance is replaced by
the difference in the number of true values at the nodes.

To get suitable perturbation-based definitions of r and
$\Delta r$, we consider the following procedure:
Find the mean field equilibrium fraction of nodes that
have the value true. Pick a random state from this
equilibrium as an initial configuration. Let the system
evolve one time step, with and without first toggling the
value of a randomly selected node. The average fractions
of nodes that in both cases copy or invert the state of
the selected node are $r_{\rm C}$ and $r_{\rm I}$, respectively. Finally, let
$r = r_{\rm C} + r_{\rm I}$ and $\Delta r = r_{\rm C} - r_{\rm I}$.

It is easy to check that the perturbation-based
definitions of r and $\Delta r$ are consistent with the rule selection
probabilities for networks of single-input nodes. By using
perturbation-based definitions of r and $\Delta r$, those
quantities are well-defined for networks with multiple inputs
per node [?], and this allows for direct comparisons to
networks with one input per node.



Products of Loop Observables 

In all of our analytical derivations for networks of 
single-input nodes, we have a common starting point: 
We consider observables, on the network, that can be ex- 
pressed as a product of observables associated with the 
loops in the network. 

To make a more precise description, we let $\mathcal{N}$ be any
network of single-input nodes, and $\nu$ be the number of
loops in $\mathcal{N}$. The dynamical properties of a loop are
determined by its length $\lambda \in \mathbb{Z}^+$, and a property $s \in \{0, +, -\}$
that we refer to as the sign of the loop. For a loop that
does not conserve information, i.e., a loop that has at
least one constant node, s = 0. All other loops have only
inverters and copy operators. If the number of inverters
is even then s = +, and if it is odd s = -.

Let $g_\lambda^s$ denote a quantity that is fully determined by
the length $\lambda$ and the sign s of a loop. We define the
product $G(\mathcal{N})$ of the loop observable $g_\lambda^s$ in $\mathcal{N}$ as

$$ G(\mathcal{N}) = \prod_{i=1}^{\nu} g_{\lambda_i}^{s_i} , \tag{1} $$

where $\lambda_1, \ldots, \lambda_\nu$ and $s_1, \ldots, s_\nu$ are the lengths and signs,
respectively, of the loops in the network $\mathcal{N}$.
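
The definition in eq. (1) can be made concrete with a short sketch that extracts the length and sign of every loop from a single-input network and multiplies a user-supplied loop observable over them. The rule labels, the function names, and the encoding of the sign as 0, +1, or -1 are assumptions made for the example.

```python
COPY, INVERT, TRUE, FALSE = "copy", "invert", "true", "false"

def loops_with_signs(inputs, rules):
    """Sorted list of (length, sign) for every loop in the network.

    sign is 0 if the loop contains a constant rule, +1 if it has an even
    number of inverters, and -1 if it has an odd number of inverters.
    """
    n = len(inputs)
    status = [0] * n  # 0: unvisited, 1: on the current path, 2: finished
    loops = []
    for start in range(n):
        path = []
        node = start
        while status[node] == 0:
            status[node] = 1
            path.append(node)
            node = inputs[node]
        if status[node] == 1:
            loop_nodes = path[path.index(node):]
            loop_rules = [rules[v] for v in loop_nodes]
            if any(rule in (TRUE, FALSE) for rule in loop_rules):
                sign = 0
            elif loop_rules.count(INVERT) % 2 == 0:
                sign = +1
            else:
                sign = -1
            loops.append((len(loop_nodes), sign))
        for visited in path:
            status[visited] = 2
    return sorted(loops)

def loop_product(loops, g):
    """Product of g(length, sign) over all loops, as in eq. (1)."""
    result = 1
    for length, sign in loops:
        result *= g(length, sign)
    return result
```

With `g = lambda length, sign: 1 if sign == 0 else 2`, the product equals 2 raised to the number of information-conserving loops.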

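If the network topology is given, but the rules are
randomized independently at each node, the average of
$G(\mathcal{N})$ can be calculated according to

$$ \langle G\rangle_{\boldsymbol\lambda} = \prod_{i=1}^{\nu} \langle g\rangle_{\lambda_i} , \tag{2} $$

where $\boldsymbol\lambda = (\lambda_1, \ldots, \lambda_\nu)$, and $\langle g\rangle_\lambda$ is the average of $g_\lambda^s$
under random choice of rules.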

We proceed by also taking the randomization of the
network topology into account. Let $\nu_\lambda$ denote the
number of loops of length $\lambda$, for $\lambda = 1, 2, \ldots$, and let
$\boldsymbol\nu = (\nu_1, \nu_2, \ldots)$. Then, the average of $\langle G\rangle_{\boldsymbol\lambda}$ over network
topologies, in networks with N nodes, can be written
as

$$ \langle G\rangle_N = \sum_{\boldsymbol\nu \in \mathbb{N}^\infty} P_N(\boldsymbol\nu) \prod_{\lambda=1}^{\infty} \left(\langle g\rangle_\lambda\right)^{\nu_\lambda} , \tag{3} $$

where $P_N(\boldsymbol\nu)$ is the probability that the distribution of
loop lengths is described by $\boldsymbol\nu$ in a network with N nodes.
We use infinities in the ranges of the sum and the
product for formal convenience. Bear in mind that $P_N(\boldsymbol\nu)$ is
nonzero only for such distributions of loop lengths as are
achievable with N nodes.
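From [?], we know that

$$ P_N(\boldsymbol\nu) = \frac{\hat\nu}{N}\, \frac{N!}{(N-\hat\nu)!\,N^{\hat\nu}} \prod_{\lambda=1}^{\infty} \frac{1}{\nu_\lambda!\,\lambda^{\nu_\lambda}} , \tag{4} $$

where

$$ \hat\nu = \sum_{\lambda=1}^{\infty} \lambda\,\nu_\lambda . \tag{5} $$

Eq. (3) provides a fundamental starting point for all of
our derivations. In its raw form, however, eq. (3) is
difficult to work with. In Appendix A we present how to
combine eq. (3) with eq. (4), to obtain

$$ \langle G\rangle_N = \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} \exp \sum_{\lambda=1}^{\infty} \frac{\langle g\rangle_\lambda - 1}{\lambda}\, z^\lambda . \tag{6} $$

To continue from eq. (6), we express $\langle g\rangle_\lambda$ in terms of
more fundamental quantities. With $r_{\rm C}$ and $r_{\rm I}$ as the
probabilities that the rule at any given node is a copy
operator or an inverter, respectively, the probability $p_\lambda^+$
that a loop of length $\lambda$ has an even number of inverters
is given by

$$ p_\lambda^+ = \tfrac12\left[(r_{\rm C} + r_{\rm I})^\lambda + (r_{\rm C} - r_{\rm I})^\lambda\right] . \tag{7} $$

Similarly, the probability for an odd number of inverters
is given by

$$ p_\lambda^- = \tfrac12\left[(r_{\rm C} + r_{\rm I})^\lambda - (r_{\rm C} - r_{\rm I})^\lambda\right] . \tag{8} $$

With $r \equiv r_{\rm C} + r_{\rm I}$ and $\Delta r \equiv r_{\rm C} - r_{\rm I}$, we see that

$$ p_\lambda^+ = \tfrac12\left[r^\lambda + (\Delta r)^\lambda\right] , \tag{9} $$
$$ p_\lambda^- = \tfrac12\left[r^\lambda - (\Delta r)^\lambda\right] , \tag{10} $$

and

$$ p_\lambda^0 = 1 - r^\lambda . \tag{11} $$

A loop that does not conserve information will always
reach a specific state in a limited number of time steps.
Such loops are not relevant for the attractor properties we
are interested in. Thus, $g_\lambda^0$ should not alter the products,
and we have $g_\lambda^0 = 1$. This gives us

$$ \langle g\rangle_\lambda = p_\lambda^0 g_\lambda^0 + p_\lambda^+ g_\lambda^+ + p_\lambda^- g_\lambda^- \tag{12} $$
$$ \phantom{\langle g\rangle_\lambda} = 1 - r^\lambda + \tilde g_\lambda , \tag{13} $$

where

$$ \tilde g_\lambda = \tfrac12\left[r^\lambda + (\Delta r)^\lambda\right] g_\lambda^+ + \tfrac12\left[r^\lambda - (\Delta r)^\lambda\right] g_\lambda^- . \tag{14} $$

Insertion of eq. (13) into eq. (6) and the power series
expansion of ln(1 - x) yield

$$ \langle G\rangle_N = \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} (1 - rz) \exp \sum_{\lambda=1}^{\infty} \frac{\tilde g_\lambda}{\lambda}\, z^\lambda . \tag{15} $$

Eq. (15) is the starting point for all network properties
we calculate.



Network Topology
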
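In this section, we investigate the distributions of the
number of information-conserving loops $\mu$ and the
number of nodes in those loops, $\hat\mu$. Both $\mu$ and $\hat\mu$ are
independent of whether the information-conserving loops have
positive or negative signs. This means that $g_\lambda^+ = g_\lambda^-$ for
all $\lambda = 1, 2, \ldots$. Hence, we let $g_\lambda^\pm \equiv g_\lambda^+ = g_\lambda^-$ and get

$$ \tilde g_\lambda = g_\lambda^\pm\, r^\lambda , \tag{16} $$

where $\tilde g_\lambda$ is the quantity defined in eq. (14), which
means that eq. (15) turns into

$$ \langle G\rangle_N = \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} (1 - rz) \exp \sum_{\lambda=1}^{\infty} \frac{g_\lambda^\pm}{\lambda}\, (rz)^\lambda . \tag{17} $$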



To investigate the distributions of $\mu$ and $\hat\mu$, we will use
generating functions. A generating function is a function
such that a desired quantity can be extracted by
calculating the coefficients in a power series expansion.

Let $[w^k]$ denote the operator that extracts the kth
coefficient in a power series expansion of a function of w.
Then, the probabilities for specific values of $\mu$ and $\hat\mu$, in
N-node networks, are given by

$$ P_N(\mu = k) = [w^k]\langle G\rangle_N \quad \text{if } g_\lambda^\pm = w \tag{18} $$

and

$$ P_N(\hat\mu = k) = [w^k]\langle G\rangle_N \quad \text{if } g_\lambda^\pm = w^\lambda . \tag{19} $$

In eq. (18), every loop is counted as one, in powers of w,
whereas in eq. (19), every node in each loop corresponds
to one factor of w.
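
The following sketch evaluates eq. (18) exactly for small N and checks it against a brute-force average over all $N^N$ topologies. It relies on a rewriting that is not spelled out in the text: expanding the operator $(1 + \partial_z/N)^N|_{z=0}$ binomially turns it into $\sum_k a_k [z^k]$ with $a_k = N!/((N-k)!\,N^k)$. With $g_\lambda^\pm = w$, the function acted on in eq. (17) is $(1 - rz)\exp(w \sum_\lambda (rz)^\lambda/\lambda)$, whose z-coefficients are polynomials in w. All function names are hypothetical, and polynomials in w are stored as coefficient lists.

```python
from fractions import Fraction
from itertools import product

def poly_add(p, q):
    size = max(len(p), len(q))
    p = p + [Fraction(0)] * (size - len(p))
    q = q + [Fraction(0)] * (size - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_scale(p, c):
    return [c * a for a in p]

def poly_mul(p, q):
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def loop_count_distribution(n, r):
    """[P_N(mu = 0), ..., P_N(mu = n)] from eqs. (17) and (18)."""
    r = Fraction(r)
    # E[k] = [z^k] exp(w * sum_lambda (r z)^lambda / lambda), a polynomial in w,
    # from the recurrence k E_k = w * sum_{j=1..k} r^j E_{k-j}.
    E = [[Fraction(1)]]
    for k in range(1, n + 1):
        acc = [Fraction(0)]
        for j in range(1, k + 1):
            acc = poly_add(acc, poly_scale(E[k - j], r ** j))
        E.append(poly_scale([Fraction(0)] + acc, Fraction(1, k)))
    total = [Fraction(0)]
    a_k = Fraction(1)  # a_k = N! / ((N - k)! N^k)
    for k in range(n + 1):
        # Multiply by (1 - r z): F_k = E_k - r E_{k-1}.
        F_k = E[k] if k == 0 else poly_add(E[k], poly_scale(E[k - 1], -r))
        total = poly_add(total, poly_scale(F_k, a_k))
        a_k *= Fraction(n - k, n)
    return total

def loop_lengths(inputs):
    """Lengths of all loops in the functional graph node -> inputs[node]."""
    status, lengths = [0] * len(inputs), []
    for start in range(len(inputs)):
        path, node = [], start
        while status[node] == 0:
            status[node] = 1
            path.append(node)
            node = inputs[node]
        if status[node] == 1:
            lengths.append(len(path) - path.index(node))
        for visited in path:
            status[visited] = 2
    return lengths

def brute_force_distribution(n, r):
    """Average over all n^n topologies; a loop of length lam conserves
    information with probability r^lam."""
    r = Fraction(r)
    total = [Fraction(0)] * (n + 1)
    for inputs in product(range(n), repeat=n):
        poly = [Fraction(1)]
        for lam in loop_lengths(inputs):
            poly = poly_mul(poly, [1 - r ** lam, r ** lam])
        total = poly_add(total, poly)
    return poly_scale(total, Fraction(1, n ** n))
```

For N = 2 and r = 1 both routes give $P(\mu = 1) = 3/4$ and $P(\mu = 2) = 1/4$, which can also be verified by listing the four possible maps by hand.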

For probability distributions described by generating
functions, there are convenient ways to extract the
statistical moments. Let m denote $\mu$ or $\hat\mu$. Then, $\langle m\rangle$ and
$\langle m^2\rangle$ can be calculated according to

$$ \langle m\rangle = \partial_w\big|_{w=1} \sum_{k=0}^{\infty} P_N(m = k)\, w^k \tag{20} $$
$$ \phantom{\langle m\rangle} = \partial_w\big|_{w=1} \langle G\rangle_N \tag{21} $$

and

$$ \langle m^2\rangle = (1 + \partial_w)\,\partial_w\big|_{w=1} \sum_{k=0}^{\infty} P_N(m = k)\, w^k \tag{22} $$
$$ \phantom{\langle m^2\rangle} = (1 + \partial_w)\,\partial_w\big|_{w=1} \langle G\rangle_N . \tag{23} $$

Starting from eqs. (18)-(23), we derive some results for
$\mu$ and $\hat\mu$. The derivations are presented in Appendix C.
For large N, the probability distribution of $\mu$ approaches
a Poisson distribution with average ln[1/(1 - r)], whereas
the limiting distribution of $\hat\mu$ decays exponentially as
$P(\hat\mu = k) \propto r^k$.
In Appendix C, we also calculate asymptotic
expansions for the mean values and variances of $\mu$ and $\hat\mu$, in
the case that r = 1. The technique to derive asymptotic
expansions for products of loop observables is presented
in Appendix B.

For r = 1, $\mu$ is equivalent to the number of components
in a random map. Similarly, $\hat\mu$ corresponds to the size of
the invariant set in a random map. The invariant set is
the set of all elements that can be mapped to themselves
if the map is iterated a suitable number of times. Such
elements are located on loops in the network graph.

Using the tools in Appendices B and C, one can equally
well derive asymptotic expansions for higher statistical
moments as for the mean and variance. In the results
section, we state the leading orders of the asymptotic
expansions for the 3rd and 4th order cumulants of the
distribution of the number of components in a random
map.



On the Number of States in Attractors

For a given Boolean network with in-degree one, the
number of states $\Omega_L$ in L-cycles can be expressed as a
product of loop observables. If $\Omega_L$ is calculated
separately for every loop in the network, the product of these
quantities gives $\Omega_L$ for the whole network.

Every loop with an even number of inverters and length
$\lambda$ can have $2^{\gcd(\lambda,L)}$ states that are repeated after L
time steps, where gcd(a, b) denotes the greatest common
divisor of a and b. Hence, such a loop will contribute
with the factor $g_\lambda^+ = 2^{\gcd(\lambda,L)}$ to the product. Similarly,
for a loop with an odd number of inverters, this factor
is $g_\lambda^- = 2^{\gcd(\lambda,L)}$ if $L/\gcd(\lambda,L)$ is even and $g_\lambda^- = 0$
otherwise. The requirement that $L/\gcd(\lambda,L)$ is even comes
from the fact that the state of the loop should be inverted
an even number of times during L time steps.

The condition that $L/\gcd(\lambda,L)$ is even can be
reformulated in terms of divisibility by powers of 2. Let $\Delta_L$
denote the maximal integer power of 2 such that $\Delta_L \mid L$,
where the relation $\mid$ means that the number on the left
hand side is a divisor of the number on the right hand
side. Then, we get

$$ L/\gcd(\lambda,L) \ \text{odd} \iff \Delta_L \mid \lambda . \tag{24} $$

With

$$ g_\lambda^+ = 2^{\gcd(\lambda,L)} \tag{25} $$

and

$$ g_\lambda^- = \begin{cases} 2^{\gcd(\lambda,L)} & \text{if } \Delta_L \nmid \lambda \\ 0 & \text{if } \Delta_L \mid \lambda \end{cases} \tag{26} $$

inserted into eq. (14), we get

$$ \tilde g_\lambda = 2^{\gcd(\lambda,L)} \begin{cases} r^\lambda & \text{if } \Delta_L \nmid \lambda \\ \tfrac12\left[r^\lambda + (\Delta r)^\lambda\right] & \text{if } \Delta_L \mid \lambda . \end{cases} \tag{27} $$

Now, $\langle\Omega_L\rangle_N$ can be calculated from the insertion of
eq. (27) into eq. (15). The arithmetic mean, $\langle\Omega_L\rangle_N$, is,
however, in many cases a bad measure of $\Omega_L$ for a typical
network. To see this, we investigate the geometric mean
of $\Omega_L$.

We let $\langle\Omega_L\rangle_N^G$ be the geometric mean of nonzero $\Omega_L$,
and $R_N^L$ be the probability that $\Omega_L \neq 0$, for networks of
size N. The probability distribution of $\log_2 \Omega_L$ is
generated by a product of loop observables according to

$$ P_N(\log_2 \Omega_L = k) = [w^k]\langle G\rangle_N \tag{28} $$

with

$$ \tilde g_\lambda = w^{\gcd(\lambda,L)} \begin{cases} r^\lambda & \text{if } \Delta_L \nmid \lambda \\ \tfrac12\left[r^\lambda + (\Delta r)^\lambda\right] & \text{if } \Delta_L \mid \lambda . \end{cases} \tag{29} $$

The probability that $\Omega_L = 0$ is not included in eq. (28)
for $k \in \mathbb{N}$. All other possible values of $\Omega_L$ are included,
and this means that

$$ R_N^L = \langle G\rangle_N \big|_{w=1} . \tag{30} $$

Furthermore, it is clear that

$$ R_N^L \log_2 \langle\Omega_L\rangle_N^G = R_N^L \langle \log_2 \Omega_L\rangle_N \tag{31} $$
$$ \phantom{R_N^L \log_2 \langle\Omega_L\rangle_N^G} = \partial_w\big|_{w=1} \langle G\rangle_N , \tag{32} $$

where the average of $\log_2 \Omega_L$ is calculated with respect
to networks with $\Omega_L \neq 0$.
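
The loop factors $g_\lambda^+$ and $g_\lambda^-$ used for $\Omega_L$ can be checked on small networks by comparing the product over loops with a direct count of the states that return to themselves after L synchronous updates. The sketch below is such a check under stated assumptions: the rule labels, helper names, and the brute-force enumeration are illustrative and limited to small N.

```python
import itertools
import random
from math import gcd

COPY, INVERT, TRUE, FALSE = "copy", "invert", "true", "false"

def step(state, inputs, rules):
    """One synchronous update of a single-input network."""
    out = []
    for source, rule in zip(inputs, rules):
        if rule == COPY:
            out.append(state[source])
        elif rule == INVERT:
            out.append(not state[source])
        else:
            out.append(rule == TRUE)
    return tuple(out)

def loops_with_signs(inputs, rules):
    """(length, sign) per loop; sign 0 marks a loop with a constant rule."""
    status, loops = [0] * len(inputs), []
    for start in range(len(inputs)):
        path, node = [], start
        while status[node] == 0:
            status[node] = 1
            path.append(node)
            node = inputs[node]
        if status[node] == 1:
            loop_rules = [rules[v] for v in path[path.index(node):]]
            if any(rule in (TRUE, FALSE) for rule in loop_rules):
                sign = 0
            else:
                sign = +1 if loop_rules.count(INVERT) % 2 == 0 else -1
            loops.append((len(loop_rules), sign))
        for visited in path:
            status[visited] = 2
    return loops

def omega_from_loops(inputs, rules, L):
    """Omega_L as a product of loop factors."""
    result = 1
    for length, sign in loops_with_signs(inputs, rules):
        g = gcd(length, L)
        if sign == +1:
            result *= 2 ** g
        elif sign == -1:
            # Nonzero only if the loop state is inverted an even number of times.
            result *= 2 ** g if (L // g) % 2 == 0 else 0
        # sign == 0: the loop freezes and contributes a factor 1
    return result

def omega_brute_force(inputs, rules, L):
    """Count the states that reappear after exactly L update steps."""
    count = 0
    for state in itertools.product([False, True], repeat=len(inputs)):
        current = state
        for _ in range(L):
            current = step(current, inputs, rules)
        if current == state:
            count += 1
    return count
```

As an example, a two-node loop with one inverter, plus a self-copying node and one tree node, has $\Omega_1 = \Omega_2 = 0$ and $\Omega_4 = 8$: the odd loop runs through a 4-cycle, and the self-copying node doubles the count.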

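Insertion of eq. (29) into eq. (15) yields

$$ \langle G\rangle_N = \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} F_L(w,z) , \tag{33} $$

where

$$ F_L(w,z) = (1 - rz) \exp \sum_{\lambda=1}^{\infty} \frac{w^{\gcd(\lambda,L)}}{\lambda}\, r^\lambda z^\lambda \times \exp \sum_{k=1}^{\infty} \frac{w^{\gcd(k\Delta_L,L)}}{2k\Delta_L} \left[(\Delta r)^{k\Delta_L} - r^{k\Delta_L}\right] z^{k\Delta_L} , \tag{34} $$

where $\Delta_L$ is the largest integer power of 2 that divides L.

$F_L$ provides a convenient way to describe our results
thus far. We have

$$ \langle\Omega_L\rangle_N = \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} F_L(2,z) , \tag{35} $$

$$ R_N^L = \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} F_L(1,z) , \tag{36} $$

and

$$ \langle\Omega_L\rangle_N^G = \exp\left[ \frac{\ln 2}{R_N^L} \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} \partial_w\big|_{w=1} F_L(w,z) \right] . \tag{37} $$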
Note that L = 0 can be inserted directly into eq. (34) to
investigate the distribution of the total number of states
in attractors. This works because 0 is divisible by any
non-zero number, and hence $\gcd(\lambda, 0) = \lambda$ for all $\lambda \in \mathbb{Z}^+$.
Insertion of L = 0 into eq. (34), together with standard
power series expansions, yields

$$ F_0(w,z) = \frac{1 - rz}{1 - wrz} . \tag{38} $$

Eq. (38) gives $F_0(1,z) = 1$, which means that $R_N^0 = 1$.
The result $R_N^0 = 1$ is easily understood, because every
network must have at least one attractor, and thus a
nonzero $\Omega_0$.

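The limits $\langle\Omega_L\rangle_\infty$, $R_\infty^L$, and $\langle\Omega_L\rangle_\infty^G$ of $\langle\Omega_L\rangle_N$, $R_N^L$, and
$\langle\Omega_L\rangle_N^G$ as $N \to \infty$ are in many cases easy to extract. For
power series of z with convergence radii larger than 1, we
have the operator relation

$$ \lim_{N\to\infty} \left.\left(1 + \frac{\partial_z}{N}\right)^{N}\right|_{z=0} = \Big|_{z=1} , \tag{39} $$

which means that the limit can be extracted by letting
z = 1 in the given function. In the cases that fulfill the
convergence criterion above, we get

$$ \langle\Omega_L\rangle_\infty = F_L(2,1) , \tag{40} $$
$$ R_\infty^L = F_L(1,1) , \tag{41} $$

and

$$ \langle\Omega_L\rangle_\infty^G = \exp\left[ \ln 2\; \partial_w\big|_{w=1} \ln F_L(w,1) \right] . \tag{42} $$

With one exception, all of eqs. (40)-(42) hold if r < 1.
The exception is that eq. (40) does not hold if L = 0 and
$r \geq 1/2$.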

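Using the tools in Appendix B, we find that

$$ \langle\Omega_0\rangle_N \approx \begin{cases} \dfrac{1 - r}{1 - 2r} & \text{for } r < \tfrac12 \\[1ex] \tfrac12\sqrt{\pi N/2} & \text{for } r = \tfrac12 \\[1ex] \sqrt{\pi N/2}\; e^{[\ln 2r - 1 + 1/(2r)]N} & \text{for } r > \tfrac12 \end{cases} \tag{43} $$

and

$$ \langle\Omega_0\rangle_N^G \approx \begin{cases} 2^{r/(1-r)} & \text{for } r < 1 \\[1ex] 2^{\sqrt{\pi N/2}} & \text{for } r = 1 \end{cases} \tag{44} $$

for large N. Note that the leading term in the
asymptote of $\langle\Omega_0\rangle_N$ for r > 1/2 comes from the pole in $F_0(2, z)$
at z = 1/(2r). If r > 1/2, then z = 1/(2r) lies inside the
contour |z - 1/3| = 2/3, which is used as integration path
in Appendix B. See Appendices C and D for examples on
how to use the technique presented in Appendix B.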
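
The behavior of $\langle\Omega_0\rangle_N$ can also be checked numerically without the contour technique. The sketch below uses two rewritings that are assumptions of this example rather than formulas quoted from the text: the operator $(1 + \partial_z/N)^N|_{z=0}$ is expanded into $\sum_k a_k [z^k]$ with $a_k = N!/((N-k)!\,N^k)$, and $F_0(2,z) = (1 - rz)/(1 - 2rz)$ is expanded as $1 + \sum_{k\ge1} \tfrac12 (2r)^k z^k$. The function names are hypothetical, and floating-point arithmetic limits the usable range of N for r > 1/2.

```python
import math

def mean_omega0(n, r):
    """Exact <Omega_0>_N for n nodes, from F_0(2, z) = (1 - r z)/(1 - 2 r z)."""
    total = 0.0
    a_k = 1.0  # a_k = N! / ((N - k)! N^k)
    for k in range(n + 1):
        coeff = 1.0 if k == 0 else 0.5 * (2.0 * r) ** k  # [z^k] F_0(2, z)
        total += a_k * coeff
        a_k *= (n - k) / n
    return total

def subcritical_limit(r):
    """Large-N limit F_0(2, 1), valid for r < 1/2."""
    return (1.0 - r) / (1.0 - 2.0 * r)

def exponential_asymptote(n, r):
    """Leading large-N behavior for r > 1/2 (pole at z = 1/(2r))."""
    rate = math.log(2.0 * r) - 1.0 + 1.0 / (2.0 * r)
    return math.sqrt(math.pi * n / 2.0) * math.exp(rate * n)
```

For N = 2 and r = 1, the four possible maps have 2, 2, 4, and 4 states on attractors, so the average is 3, in agreement with `mean_omega0(2, 1.0)`.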

Only if r < 1/2 do $\langle\Omega_0\rangle_N$ and $\langle\Omega_0\rangle_N^G$ have the same
qualitative behavior for large N. Otherwise the broad tail
in the distribution of $\hat\mu$ dominates the value of $\langle\Omega_0\rangle_N$. If
1/2 < r < 1, $\langle\Omega_0\rangle_N^G$ approaches a constant, while $\langle\Omega_0\rangle_N$
grows exponentially with N. For the critical case, r = 1,
the qualitative difference lies in the power of N in the
exponent.

For $L \neq 0$, the difference between $\langle\Omega_L\rangle_N$ and $\langle\Omega_L\rangle_N^G$
is less pronounced. Both $\langle\Omega_L\rangle_N$ and $\langle\Omega_L\rangle_N^G$ approach
constants as $N \to \infty$ if r < 1, and they both grow like
powers of N if r = 1. It is also worth noting that $R_\infty^L \neq 0$
for r < 1, whereas $R_\infty^L = 0$ if r = 1 but $\Delta r < 1$.
In the latter case, $R_N^L$ approaches 0 like $N^{-1/(4\Delta_L)}$; see
Appendix D. If r = 1 and $\Delta r = 1$, i.e., the network has
only copy operators, $R_N^L = 1$ for all $N \in \mathbb{Z}^+$.

In Appendix D, we investigate $\langle\Omega_L\rangle_N$ and $\langle\Omega_L\rangle_N^G$ for
L > 0, in detail for the case that r = 1 and $\Delta r < 1$,
which corresponds to the most commonly occurring cases
of critical networks. For large N, we have the asymptotic
relations

$$ \langle\Omega_L\rangle_N \propto N^{U_L} \tag{45} $$

for the arithmetic mean of the number of L-cycle states,
and

$$ \langle\Omega_L\rangle_N^G \propto N^{u_L} \tag{46} $$

for the corresponding geometric mean, with the
exponents $U_L$ and $u_L$ given by eqs. (D23) and (D17) in
Appendix D. For large L, we have



$$ U_L \approx \frac{2^L}{2L} . \tag{47} $$



The other exponent, $u_L$, which appears in the scaling
of the geometric mean, is trickier to estimate. However,
we derive an upper bound from $\varphi(\ell) \leq \ell$, where $\varphi$ is the
Euler function, as described in Appendix D. From this
inequality, combined with eqs. (D17) and (D10), we find
that

$$ u_L < \frac{\ln 2}{2}\, d(L) , \tag{48} $$

where d(L) is the number of divisors of L. To show that
$u_L$ is not bounded for arbitrary L, we let $L = 2^m$, where
$m \in \mathbb{N}$, and find that $h_L = (m + 1)/2$ and

$$ u_L = \frac{\ln 2}{4}\,(m + 1) . \tag{49} $$

Although $\langle\Omega_L\rangle_N$ and $\langle\Omega_L\rangle_N^G$ share the property that
they grow like powers of N, the values of the powers
differ strongly in a qualitative sense. Yet neither case
has an upper limit to the exponent in the power law.
Thus, the observation that the total number of
attractors grows superpolynomially with N is true not only for
the arithmetic mean, but also for the geometric mean.
This is consistent with the derivations in [?] that show
that the typical number of attractors grows faster than
polynomially with N.
RESULTS

Our most important findings are the expression for the
expectation value of products of loop observables on the
graph of a random map [eq. (15)] and the asymptotic
expansions for such quantities. Using these tools, we derive
new results on basic properties of random maps, and on
Boolean dynamics on the graph of a random map. In
the latter case, we investigate random Boolean networks
with in-degree one, and compare those to more
complicated random Boolean networks.



Random Maps 

For critical random Boolean networks with in-degree
one, all loops conserve information. This is because
no constant Boolean rules are allowed in a critical
network. For such a network, the number of
information-conserving loops, $\mu$, is also the number of components
of the network graph. This graph is also the graph of a
random map. Thus, $\mu$ can be seen as the number of
components in a random map. Analogous to the
interpretation of $\mu$, the number of nodes in information-conserving
loops, $\hat\mu$, can be seen as the number of elements in the
invariant set of a random map.

We derive the probability distributions of $\mu$ and $\hat\mu$, in
a form that also can be obtained from other approaches
[20, 21]. For critical networks, we derive asymptotic
expansions for the means and variances of $\mu$ and $\hat\mu$, and
find that

$$ \langle\mu\rangle = \tfrac12(\ln 2N + \gamma) + \tfrac16\sqrt{2\pi}\,N^{-1/2} + O(N^{-1}) , \tag{50} $$

$$ \sigma^2(\mu) = \tfrac12(\ln 2N + \gamma) - \tfrac18\pi^2 + \tfrac16(3 - 2\ln 2)\sqrt{2\pi}\,N^{-1/2} + O(N^{-1}) , \tag{51} $$

$$ \langle\hat\mu\rangle = \tfrac12\sqrt{2\pi N} - \tfrac13 + \tfrac{1}{24}\sqrt{2\pi}\,N^{-1/2} + O(N^{-1}) , \tag{52} $$

and

$$ \sigma^2(\hat\mu) = \tfrac12(4 - \pi)N - \tfrac16\sqrt{2\pi N} - \tfrac{1}{36}(3\pi - 8) + O(N^{-1/2}) , \tag{53} $$

where N is the number of nodes in the network, and
$\gamma$ is the Euler-Mascheroni constant. These expansions
converge rapidly to corresponding exact values, for
increasing N.

The leading terms $\tfrac12(\ln 2N + \gamma)$ of eqs. (50) and (51)
have been derived earlier. See [?] on $\langle\mu\rangle$ and
[28, 29] on $\sigma^2(\mu)$. The leading term of eq. (52) is found
in [23]. The other terms in eqs. (50)-(53) appear to be
new. Some additional terms are presented in Appendix C,
starting from eq. (C23).
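
The convergence of the expansions for the two means can be examined with a few lines of Python. The exact values used below rest on two identities that are assumptions of this sketch rather than formulas quoted from the appendices: with $a_k = N!/((N-k)!\,N^k)$, the probability of having k nodes on loops is $(k/N)\,a_k$, which gives $\langle\hat\mu\rangle = \sum_{k=1}^{N} a_k$ and, combined with the harmonic-number mean for the number of cycles in a random permutation, $\langle\mu\rangle = \sum_{k=1}^{N} a_k/k$. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exact_means(n):
    """Exact (<mu>, <mu_hat>) for a random map on n elements."""
    a_k = 1.0
    mean_mu = 0.0
    mean_mu_hat = 0.0
    for k in range(1, n + 1):
        a_k *= (n - (k - 1)) / n  # a_k = prod_{j<k} (1 - j/n)
        mean_mu += a_k / k
        mean_mu_hat += a_k
    return mean_mu, mean_mu_hat

def expansion_mu(n):
    """Eq. (50) without the O(1/N) remainder."""
    return (0.5 * (math.log(2 * n) + EULER_GAMMA)
            + math.sqrt(2 * math.pi) / (6 * math.sqrt(n)))

def expansion_mu_hat(n):
    """Eq. (52) without the O(1/N) remainder."""
    return (0.5 * math.sqrt(2 * math.pi * n) - 1.0 / 3.0
            + math.sqrt(2 * math.pi) / (24 * math.sqrt(n)))
```

For N = 2 the exact means are 5/4 and 3/2, as one can confirm by listing the four maps on two elements; for N in the thousands the truncated expansions and the exact sums differ only by a small remainder.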

The technique presented in Appendix B lets us also
calculate expansions for cumulants of higher orders. The
leading orders of the 3rd and 4th cumulants for the
distribution of $\mu$ give an interesting hint. Let $\langle\mu^3\rangle_c$ and
$\langle\mu^4\rangle_c$ denote those cumulants, respectively. Then, we get

$$ \langle\mu^3\rangle_c = \langle\mu\rangle + \tfrac74\zeta(3) - \tfrac38\pi^2 + O(N^{-1/2}) \tag{54} $$

and

$$ \langle\mu^4\rangle_c = \langle\mu\rangle + \tfrac{21}{2}\zeta(3) - \tfrac78\pi^2 - \tfrac{1}{16}\pi^4 + O(N^{-1/2}) , \tag{55} $$

where $\zeta(s)$ denotes the Riemann zeta function. All
cumulants from the 1st to the 4th order grow like $\tfrac12 \ln N$. One
could guess that all cumulants have this property. If so,
the distribution of $\mu$ is very closely related to a Poisson
distribution for large N. (Bear in mind that all
cumulants for a Poisson distribution are equal to the average
for the distribution.)

Furthermore, it seems like the process of calculating
higher order cumulants, as well as including more terms
in the expansions, can be fully automated. As far as we
know, only a very limited number of terms, and only for
mean values and variances, have been derived in earlier
work.

Random Boolean Networks 

Our main results from the analytical calculations are
the expressions that yield the arithmetic mean $\langle\Omega_L\rangle_N$,
and the geometric mean $\langle\Omega_L\rangle_N^G$, of the number of states
in L-cycles. See eqs. (35)-(37) on expressions for general
N, and eqs. (40)-(42) on expressions valid for the
high-N limit. In Appendix E, we present derivations that
relate this work to results from [?]. These derivations
yield an expression suitable for calculation of exact values
of $\langle\Omega_L\rangle_N$ via a power series expansion of the function
$F_L(2,z)$ in eq. (34).

For the arithmetic means, the number of proper
L-cycles $\langle C_L\rangle_N$ can be calculated from the number of states
$\langle\Omega_\ell\rangle_N$ in all $\ell$-cycles, provided that $\langle\Omega_\ell\rangle_N$ is known for all
$\ell$ that divide L. This is done via the inclusion-exclusion
principle as described in Supporting Text to [?]. For the
corresponding geometric means we cannot use a similar
technique, because such means do not have the needed
additive properties.

Our results on random Boolean networks are divided
into two parts. First, we illustrate our quantitative
results on networks with in-degree one. To a large extent,
the qualitative results are expected from earlier
publications.

From [10], we know that in networks with in-degree
one, as $N \to \infty$, the typical number of relevant
variables approaches a constant for subcritical networks, and
scales as $\sqrt{N}$ for critical networks. This indicates that
for subcritical networks, the average number of L-cycles
and the average number of states in attractors are likely
to approach constants as $N \to \infty$.
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FIG. 1: The average number of proper L-cycles as a function 
of N for different L, for networks with single-input nodes, 
r = 1 in (a), and r = 3/4 (solid lines) and r = 1/2 (dotted 
lines) in (b). In (a), L is indicated in the plot. In (b), L is 3, 
5, 7, 1, 2, 4, 6, and 8 for r = 3/4 and 7, 5, 3, 8, 6, 4, 2, and 
1 for r — 1/2, in that order, from bottom to top along the 
right boundary of the plot area. In (b), the curves for L = 3 
and L = 5 for r = 3/4 essentially coincide at the right side of 
the plot, whereas they split up to the left, with L = 3 as the 
upper curve there. 



On the other hand, [6] points out that the probability
distributions of the number of cycles in critical networks
have very broad tails. Hence, the arithmetic mean can
be much larger than the median of the number of cycles,
and this may also be the case for subcritical networks.
In 0, it is found that this effect leads to divergence as
$N \to \infty$, in the mean number of attractors, for networks
with the stability parameter r in the range r > 1/2. It is
also found that the mean number of cycles of any specific
length L converges for large N. For critical networks,
it is clear that both the typical number and the mean
number of attractors grow superpolynomially with N, in
networks with in-degree one [14].

FIG. 2: The average number of proper L-cycles for networks
with N = 100 and r = 3/4, as function of L. $\Delta r = 3/4$ (thin
solid line), $\Delta r = 0$ (thick solid line) and $\Delta r = -3/4$ (dotted
line). Note the importance of what numbers divide L.

Quantitative results that reflect the above properties 
for networks of finite sizes are, however, for the most part 
highly non-trivial to obtain from earlier work. We let 
figs. illustrate our results in this category. Regarding 
fig. |21 it is important to note that the geometric mean 
of the number of states in attractors can be obtained 
directly from [ic| . 

In the second part of our results on random Boolean
networks, we compare networks with multiple inputs per
node to networks with a single input per node. From
a system theoretic viewpoint, this part is the most
interesting, because a general understanding of the
multi-input effects vs. single-input effects in dynamical
networks would be very valuable. Although this issue has
been addressed before, in, e.g., 0, llCj . our results are
only partly explained. These results are illustrated in

figs. USED 

Fig. 1 shows the numbers of attractors of various short
lengths as a function of system size, plotted for different
values of the stability parameter r. We let $\Delta r = 0$,
corresponding to equal probabilities of inverters and copy
operators in the networks. For critical networks, with
r = 1, the asymptotic growth of the average number of
proper L-cycles, $\langle C_L\rangle_N$, is a power law, while
$\langle C_L\rangle_N$ approaches a constant for subcritical
networks as N goes to infinity.

For networks with $\Delta r \neq 0$, the prevalences of copy
operators and inverters are not identical. Cycles of even
length are in general more common than cycles of odd
length. An overabundance of inverters strengthens this
difference, and conversely a lower fraction of inverters
makes the difference less pronounced. See fig. 2, which
shows the symmetric case $\Delta r = 0$ and the extreme cases
$\Delta r = \pm r$.

FIG. 3: Arithmetic and geometric means of the number of
states, $\Omega_0$, in attractors. $\langle\Omega_0\rangle_N$ (solid lines) and
$\langle\Omega_0\rangle^G_N$ (dotted lines) for r = 1/2, r = 3/4 and r = 1,
in that order, from the bottom to the top of the plot. Note that
both $\langle\Omega_0\rangle_N$ and $\langle\Omega_0\rangle^G_N$ are independent of $\Delta r$.

The total number of attractors, $\langle C\rangle_N$, and the total
number of states in attractors, $\langle\Omega_0\rangle_N$, can diverge for
large N, even though the number of attractors of any
fixed length converges. This is true for subcritical
networks with r > 1/2, and is illustrated in figs. 3 and 4.
The growth of $\langle\Omega_0\rangle_N$ is exponential if r > 1/2.
Interestingly, there is no qualitative difference in the growth of
$\langle\Omega_0\rangle_N$ when comparing the critical case of r = 1 to the
subcritical ones with 1 > r > 1/2.

For r < 1/2, both $\langle C\rangle_N$ and $\langle\Omega_0\rangle_N$ converge to
constants for large N. In the borderline case r = 1/2, $\langle\Omega_0\rangle_N$
diverges like a square root of N, but $\langle C\rangle_N$ seems to
approach a constant. See fig. 4.

The number of states in attractors, $\Omega_0$, of a single-input
node network is directly related to the total number
of nodes, $\hat\mu$, that are part of information-conserving
loops. Every state of those nodes corresponds to a state
in an attractor, and vice versa. Thus, $\Omega_0 = 2^{\hat\mu}$, meaning
that

$$\langle\Omega_0\rangle_N = \langle 2^{\hat\mu}\rangle_N \tag{56}$$

and

$$\langle\Omega_0\rangle^G_N = 2^{\langle\hat\mu\rangle_N} . \tag{57}$$

FIG. 4: The arithmetic mean of the number of attractors with
lengths $L < L_{\max}$ in networks with N single-input nodes, for
different values of N. In (a), $N = 10, 10^2, \ldots, 10^5$ for r = 1
(thin solid lines) and $N = 10, \ldots, 10^4$ for r = 3/4 (thin dotted
lines). In (b), $N = 10, 10^2, 10^3$ (thin solid lines) for r = 1/2
and N = 10 for r = 1/4 (thin dotted line). For all cases, $\Delta r = 0$.
The thick lines in (a) and (b) show the limiting number
of attractors when $N \to \infty$. The arrowhead in (b) marks
this limit for $L_{\max} = 10^7$ for r = 1/2. The small increase in
the number of attractors when $L_{\max}$ is changed from $10^3$ to
$10^7$ indicates that $\langle C\rangle_N$ converges when $N \to \infty$. Note the
drastic change in the y-scale between the case r > 1/2 and
r < 1/2.

FIG. 5: Comparison between simulations for power law
in-degree networks of size N = 20 (bold lines) and the
corresponding networks with single-input nodes (thin lines). The
fitted networks have identical values for r, $\Delta r$, and N. The
solid lines show the number of fixed points, whereas the
dashed and dotted lines show the number of 2-cycles plus
fixed points and the total number of attractors, respectively.
The probability distribution of in-degrees satisfies $p_K \propto K^{-\gamma}$,
where K is the number of inputs. The power law networks
use the nested canalyzing rule distribution presented in ||.

If 1/2 < r < 1, $\ln\langle\Omega_0\rangle_N$ grows linearly with N. This
stands in sharp contrast to $\langle\hat\mu\rangle_N$, which grows like $\sqrt{N}$
for r = 1 and approaches a constant for r < 1 as $N \to \infty$.
Hence, the distribution of $\hat\mu$ has a broad tail that
dominates $\langle\Omega_0\rangle_N$ if r > 1/2. This can be understood
from the limit distribution of $\hat\mu$ for large N. For this
distribution, we have $P_\infty(\hat\mu = k) \propto r^k$, which means
that r must be smaller than 1/2 for the sum of $2^k P_\infty(\hat\mu = k)$
over k to be convergent. Similar, but less dramatic,
effects occur when forming averages of $\Omega_L$ for $L \neq 0$. The
arithmetic mean is in many cases far from the typical
value. This is particularly apparent for long cycles in
large networks that are critical or close to criticality.
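The convergence condition r < 1/2 can be checked numerically. The sketch below assumes the normalized geometric limit distribution $P_\infty(\hat\mu = k) = (1 - r) r^k$, which is how we read eq. (C7) in Appendix C. The function name is illustrative only.

```python
def limiting_mean_states(r, k_max):
    """Partial sum over k = 0..k_max of 2**k * P_inf(mu_hat = k),
    assuming P_inf(mu_hat = k) = (1 - r) * r**k."""
    return sum((1 - r) * (2 * r) ** k for k in range(k_max + 1))


# r < 1/2: the partial sums settle at (1 - r) / (1 - 2 r).
print(limiting_mean_states(0.25, 200), (1 - 0.25) / (1 - 2 * 0.25))

# r > 1/2: the partial sums keep growing like (2 r)**k_max.
print(limiting_mean_states(0.75, 20), limiting_mean_states(0.75, 40))
```

For r < 1/2 the series is geometric with ratio 2r < 1, which is why the arithmetic mean of $\Omega_0$ stays finite there, while every additional term increases the sum by a factor of about 2r once r > 1/2.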

In [llj . it is shown that the typical number of attractors
grows superpolynomially with N in critical random
Boolean networks with connectivity one. From a different
approach, we find the consistent result that $\langle C\rangle^G_N$ grows
superpolynomially, where $\langle C\rangle^G_N$ is the geometric mean
of the number of attractors. We conclude this from our
investigations of the geometric mean of the number of
states in L-cycles, $\langle\Omega_L\rangle^G_N$. Here, we use the inequality
$\langle C\rangle^G_N \ge \langle\Omega_L\rangle^G_N / L$, and the result that there is no upper
bound to $u_L$ in the relation $\langle\Omega_L\rangle^G_N \propto N^{u_L}$, which holds
asymptotically for large N (see eq. (49)).

All the properties above are derived and calculated for
networks with one input per node, but they seem to a
large extent to be valid for networks with multi-input
nodes. From 0, we know that for subcritical networks
the limit of $\langle C_L\rangle_N$ as $N \to \infty$ is only dependent on r
and $\Delta r$. Hence, we can expect that $\langle C_L\rangle_N$ for a
subcritical network with multi-input nodes can be approximated
with $\langle C_L\rangle_N$ calculated for a network with single-input
nodes, but with the same r and $\Delta r$.

For the networks in ||, with a power law in-degree
distribution, the single-input approximation fits
surprisingly well, which is demonstrated in fig. 5. Not only are
the means of the numbers of attractors of different types
reproduced by this approximation, but the distributions
of these numbers are also very similar, as is shown in
fig. 6.

FIG. 6: A cross-section of fig. 5 at $\gamma = 2.5$, with simulation
results for the power law in-degree networks (bold lines),
and the corresponding single-input networks (thin lines). The
distributions of the number of attractors of different types
are presented with cumulative probabilities, along with the
corresponding means (short vertical lines at the bottom of
the plot). The solid lines show the number of fixed points,
whereas the dashed and dotted lines show the number of
2-cycles plus fixed points and the total number of attractors,
respectively. Note that the medians are found where the curves
for the probability distributions intersect 1/2 on the y-axis.

For the critical Kauffman model with in-degree 2, we
perform an analogous comparison. The number of nodes
that are non-constant grows like $N^{2/3}$ for large N |^.ll3|.
Furthermore, the effective connectivity between the
non-constant nodes approaches 1 for large N Jj|. Hence, one
can expect that this type of N-node Kauffman networks
can be mimicked by networks with $N' = N^{2/3}$ one-input
nodes. For those networks, we choose r = 1 and $\Delta r = 0$,
which are the same values as for the Kauffman networks.

For large N, $\langle C_L\rangle_N$ in the Kauffman networks grows
like $N^{(H_L-1)/3}$, where $H_L$ is the number of invariant
sets of L-cycle patterns [13]. For the selected networks
with one-input nodes, we have
$\langle C_L\rangle'_{N'} \propto N'^{(H_L-1)/2} \propto N^{(H_L-1)/3}$
for large N; see eq. This confirms
that the choice $N' = N^{2/3}$ is reasonable, but it does not
indicate whether the proportionality factor in $N' \propto N^{2/3}$
is anywhere close to 1. This factor could also be
dependent on L, as can be seen from the calculations in [13].
However, this initial guess turns out to be surprisingly
good, as is shown in fig. 7(a).
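The exponent matching behind the choice $N' = N^{2/3}$ is a one-line substitution. It is written out here as a worked step, using only the power laws quoted above:

```latex
\langle C_L \rangle'_{N'} \;\propto\; N'^{(H_L-1)/2}
  \;=\; \bigl(N^{2/3}\bigr)^{(H_L-1)/2}
  \;=\; N^{(H_L-1)/3} .
```

Any other power, $N' \propto N^{\alpha}$ with $\alpha \neq 2/3$, would give the exponent $\alpha (H_L-1)/2$ and fail to reproduce the Kauffman scaling for every L with $H_L \neq 1$. The prefactor in $N' \propto N^{2/3}$ is not fixed by this argument.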

From the good agreement for short cycles, one can
expect a similar agreement on the mean of the total number
of attractors. This is investigated in fig. 7(b). For networks
with up to about 100 nodes, the agreement is good, and
the extremely fast growth of $\langle C\rangle'_{N'}$ for larger N is
consistent with the slow convergence in the simulations.

FIG. 7: Comparison between critical K = 2 Kauffman
networks (thick lines) and the corresponding networks of
single-input nodes (thin lines). The size of the single-input networks
is set to $N' = N^{2/3}$, with r = 1 and $\Delta r = 0$, consistent with
the Kauffman model. (a) The number of proper L-cycles for
the L indicated in the plot. For the Kauffman networks, the
numbers have been calculated from Monte Carlo summation
for those network sizes where they could not be calculated
exactly. The number of fixed points is 1, independently
of N, for both network types. (b) Total number of
attractors. This quantity has been calculated analytically for
the single-input networks, and estimated by simulations for
the Kauffman networks using $10^2$, $10^3$, $10^4$, and $10^5$ random
starting configurations per network.

As with the power law networks, we also compare the
distributions of the numbers of different types of
attractors, and find a very strong correspondence. See fig. 8.
Furthermore, we see indications of undersampling, in the
estimated numbers of fixed points and 2-cycles, for the
Kauffman networks in fig. 8, as the means from the
simulations are smaller than the corresponding analytical
values.

FIG. 8: A cross-section of fig. 7 at N = 125, with simulation
results for the Kauffman networks (bold lines), and the
corresponding single-input networks (thin lines). For the
Kauffman networks, we use $10^5$ random starting configurations for
1600 network realizations. The corresponding single-input
node networks have only $N' = N^{2/3} = 25$ nodes, and we
perform exhaustive searches through the state space of the
relevant nodes in $10^6$ such networks. The distributions of the
number of attractors of different types are presented with
cumulative probabilities, along with the corresponding averages
(short vertical lines at the top and bottom of the plot).
Corresponding analytical averages, for the Kauffman networks,
are marked with arrowheads. The solid lines show the
number of fixed points, whereas the dashed and dotted lines show
the number of 2-cycles plus fixed points and the total number
of attractors, respectively. Note that the medians are found
where the curves for the probability distributions intersect
1/2 on the y-axis.



SUMMARY AND DISCUSSION 

Using analytical tools, we have investigated random
Boolean networks with single-input nodes, along with the
corresponding random maps. For random Boolean
networks, we extract the exact distributions of the average
number of cycles with lengths up to 1000 in networks
with up to $10^5$ nodes. As has been pointed out in earlier
work [6], we see that a small fraction of the networks have
many more cycles than a typical network. This property
becomes more pronounced as the system size grows, and
has drastic effects on the scaling of the average number
of states that belong to cycles.

The graph of a random Boolean network of single-input
nodes can be seen as a graph of a random map. Our
analytical approach is not only applicable to Boolean
dynamics on such a graph, but also to random maps in
general. Using this approach, we rederive some well-known
results in a systematic way, and derive some asymptotic
expansions with significantly more terms than have been
available from earlier publications. In future research, it
would be interesting to, e.g., see to what extent the ideas
from |3fj| and our paper can be combined.

Our results on random Boolean networks highlight
some previously observed artefacts. The synchronous
updates lead to dynamics that is largely governed by
integer divisibility effects. Furthermore, when counting
attractors in large networks, most of them are found in
highly atypical networks and have attractor basins that
are extremely small compared to the full state space. We
quantify the role of the atypical networks by comparing
arithmetic and geometric means of the number of states
in L-cycles. From analytical expressions, we find strong
qualitative differences between those types of averages.

The dynamics in random Boolean networks with multi- 
input nodes can to a large extent be understood in terms 
of the simpler single-input case. In direct comparisons to 
critical Kauffman networks of in-degree two and to sub- 
critical networks with power law in-degree, the agreement 
is surprisingly good. 

In [J^ > a new concept of stability in attractors of 
Boolean networks is presented. To only consider that 
type of stable attractors is one way to make more rel- 
evant comparisons to real systems. Another way is to 
focus on fixed points and stability properties as in [Hi ] 
and @. Furthermore, the limit of large systems may 
not always make sense in comparison with real systems. 
Small Boolean networks may tell more about these than 
large networks would. 

Although there are problems in making direct compar- 
isons between random Boolean networks and real sys- 
tems, we think that insight into the dynamics of Boolean 
networks will improve the general understanding of com- 
plex systems. For example, can real systems have lots of 
attractors that are never visited due to small attractor 
basins, and what implications could such attractors have 
on the system? 

A better understanding of single-input vs. multi-input
dynamics in Boolean networks could promote a better
understanding of similar effects in more complicated
dynamical systems. For the random Boolean networks, 
additional insights are required to properly explain the 
strong similarities between the single- and multiple-input 
cases. One interesting issue is to what extent a single- 
input approximation can be applied to networks with 
random rules on a fixed network graph. 



Acknowledgments 

CT acknowledges the support from the Swedish Na- 
tional Research School in Genomics and Bioinformatics. 



APPENDIX A: FUNDAMENTAL EXPRESSIONS 
FOR PRODUCTS OF LOOP OBSERVABLES 

Eq. (0} inserted into eq. and a transformation of 
the summation yield 



{G)n = E 



Nl 



N (N -v)\N 11 i>\\ V A 



W '11%. (A2) 



N (N -0)\N° H 11 A 

v ' i— 1 



Note that every term in eq. IjAlfl is split into v\ / \\^ =1 v\\ 
equal terms in eq. (|A2(1 . 



Define c^r according to 



N) 



Then 



{N-k)\N k ' 



(A3) 



(A4) 



The coefficients $c_N^k$ can be expressed as

$$c_N^k = \left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} z^k . \tag{A5}$$




This relation, together with v = Yl^—i Ai, inserted into 
eq. HA4(1 gives 



(G) 



N 



d z 

1+ N 
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N 
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2=0 „ = i 
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(ah z x 

, v\ \ f— ' A 



(A6) 



The outer sum in eq. (|A6|) can be modified to start 
from v = without altering the value of the expression. 
This property, together with the power series expansions 



E 

fc=0 



x 
M 



and 



00 1? 
x 



(A7) 



(A8) 



k=l 



yields that eq. (|A6() can be rewritten into eq. ©• 
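The operator $(1 + \partial_z/N)^N|_{z=0}$ used throughout the appendices is easy to test on monomials. The sketch below assumes that the coefficients of eq. (A3) are $c_N^k = N!/[(N-k)!\,N^k]$ and checks, with exact rational arithmetic, that the operator applied to $z^k$ reproduces them. The helper names are illustrative.

```python
from fractions import Fraction
from math import factorial


def apply_operator(coeffs, N):
    """Apply (1 + (1/N) d/dz)**N to the polynomial sum_k coeffs[k] * z**k
    and evaluate the result at z = 0."""
    poly = [Fraction(c) for c in coeffs]
    for _ in range(N):
        # Coefficient j of the derivative is (j + 1) * poly[j + 1].
        deriv = [k * poly[k] for k in range(1, len(poly))] + [Fraction(0)]
        poly = [p + d / N for p, d in zip(poly, deriv)]
    return poly[0]


def c(N, k):
    """Assumed closed form N! / ((N - k)! N**k); zero for k > N."""
    if k > N:
        return Fraction(0)
    return Fraction(factorial(N), factorial(N - k) * N ** k)


for k in range(5):
    monomial = [0] * k + [1]
    print(k, apply_operator(monomial, 4), c(4, k))
```

Since the operator is linear, its action on any power series follows term by term, which is the mechanism behind sums such as those in eqs. (C9) and (C20).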
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APPENDIX B: ASYMPTOTES FOR PRODUCTS 
OF LOOP OBSERVABLES 

To calculate eq. (JHJl for large N, we investigate the
operator $(1 + \partial_z/N)^N|_{z=0}$. Let f(z) be a function that is
analytic for $z \neq 1$, such that $|z - 1/3| < 2/3$. Furthermore,
we assume that f(z) does not have an essential
singularity at z = 1. Then,



1 



N 



N 



2 = 



N N 

Nl 

2ttI 



dz- 



C{e) 



■N+ 



(Bl) 
(B2) 



where $\epsilon$ is a small positive number, and $C(\epsilon)$ is the
contour of the region where z satisfies $|z - 1/3| < 2/3$ and
$|z - 1| > \epsilon$.

On the curve $C(\epsilon)$, $|e^{Nz}/z^{N}|$ is maximal close to z = 1,
where this expression has a saddle point. Thus, the
main contribution to the integral in eq. (B2), for large
N, comes from the vicinity of z = 1. Contributions from
other parts of $C(\epsilon)$ are suppressed exponentially with N.

To find the asymptotic behavior of eq. (B2), we perform
an expansion of f(z) around z = 1 with terms of the
form $c[-\ln(1-z)]^m (1-z)^{-a}$, where $a, c \in \mathbb{R}$, and $m \in \mathbb{N}$.
Provided that the expansion has a non-zero convergence
radius, the asymptote of eq. (B2) can be determined to
any polynomial order of N.

We start at the special case of $f(z) = (1 - z)^{-a}$. For
non-integral a, z = 1 is a branch point of f(z). For such
a we let f(z) be real-valued for real z < 1 and have a cut
line at real z > 1. For $N > \max(0, -a)$, we can change
the integration path in eq. (B2). Let $C'(\epsilon)$ follow the line
$\Re(z) = 1$ but make a turn about z = 1 in the same way
as $C(\epsilon)$. Then,



3 JV(*-1) 
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IC(e) ^ JC'(e) 

From Stirling's formula [3lj |. 
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and eq. l)B3jl . we get 
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Around z = 1, we have e"* 2 " 1 '^" « exp[iiV(z- 1) 2 ]. 
This approximation can be used as a starting point for 
a suitable expansion. To proceed, we note that we can 
write 



= Z{a)N a/2 , 
(B6) 



where 



Z(a) = 



27T JC'(e) 



dzexp[i( 2 -l) 2 ](l-z)- a . (B7) 



From the fast convergence of $\exp[\frac{1}{2}(z - 1)^2]$ along $\Re(z) = 1$
for large $|z|$, it is clear that $a \mapsto Z(a)$ is well defined
and continuous for all a.
With y = 1 - z, we get



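$$\frac{e^{N(z-1)}}{z^{N+1}}\,(1-z)^{-a}
  = y^{-a}\,\frac{\exp\{N[-y - \ln(1-y)]\}}{1-y} \tag{B8}$$

$$= \exp(\tfrac{1}{2}Ny^2)\, y^{-a} \Bigl[ 1 + y + \tfrac{1}{3}Ny^3 + y^2 + \tfrac{7}{12}Ny^4
  + \tfrac{1}{18}N^2y^6 + y^3 + \tfrac{47}{60}Ny^5 + \tfrac{5}{36}N^2y^7 + \tfrac{1}{162}N^3y^9
  + O(y^4) + N\,O(y^6) + N^2 O(y^8) + N^3 O(y^{10}) + \cdots \Bigr] . \tag{B9}$$
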



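We insert this result into eq. (B5), and get

$$\left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} (1-z)^{-a}
  = N^{a/2}\left[\sum_{k=0}^{3} Z_k(a)\, N^{-k/2} + O(N^{-2})\right] , \tag{B10}$$

where

$$Z_0(a) = Z(a) , \tag{B11}$$

$$Z_1(a) = Z(a-1) + \tfrac{1}{3}Z(a-3) , \tag{B12}$$

$$Z_2(a) = \tfrac{1}{12}Z(a) + Z(a-2) + \tfrac{7}{12}Z(a-4) + \tfrac{1}{18}Z(a-6) , \tag{B13}$$

and

$$Z_3(a) = \tfrac{1}{12}Z(a-1) + \tfrac{37}{36}Z(a-3) + \tfrac{47}{60}Z(a-5)
  + \tfrac{5}{36}Z(a-7) + \tfrac{1}{162}Z(a-9) . \tag{B14}$$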

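Iterated differentiation of eq. (B10) with respect to a
gives

$$\left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} [-\ln(1-z)]^m (1-z)^{-a}
  = N^{a/2}\left(\tfrac{1}{2}\ln N + \partial_a\right)^{m}
    \left[\sum_{k=0}^{3} Z_k(a)\, N^{-k/2} + O(N^{-2})\right] . \tag{B15}$$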



It remains for us to calculate Z(a). For a < 1, eq. (B7)
can be rewritten as

$$Z(a) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, \exp(-\tfrac{1}{2}x^2)\,(-ix)^{-a} , \tag{B16}$$

which means that

$$Z(a) = 2^{-a/2}\,\pi^{-1/2} \cos(\tfrac{1}{2}\pi a)\,\Gamma(\tfrac{1}{2} - \tfrac{1}{2}a) \tag{B17}$$

for a < 1. From eq. (B7) and partial integration, we find
that

$$Z(a-2) = (a-1)\,Z(a) , \tag{B18}$$

which is consistent with eq. (B17). Hence, eq. (B17)
is valid for all a, provided that the right hand side is
replaced with an appropriate limit in case that a is an
odd positive integer. The use of the limit is motivated
by the continuity of Z.
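A quick numerical check of the recurrence can be made with the standard library. The sketch assumes that eq. (B17) reads $Z(a) = 2^{-a/2}\pi^{-1/2}\cos(\pi a/2)\,\Gamma(1/2 - a/2)$. It evaluates this expression away from odd positive integers, where the limit mentioned above would be needed instead.

```python
from math import cos, gamma, isclose, pi, sqrt


def Z(a):
    """Assumed form of eq. (B17); not defined at a = 1, 3, 5, ...
    where Gamma has a pole and the limit must be taken."""
    return 2 ** (-a / 2) / sqrt(pi) * cos(pi * a / 2) * gamma(0.5 - a / 2)


# Recurrence Z(a - 2) = (a - 1) Z(a), eq. (B18), at a few non-integer points.
for a in (0.3, 2.5, -1.7):
    print(a, Z(a - 2), (a - 1) * Z(a))

# Z(0) = 1, and Z(a) approaches sqrt(pi / 2) as a approaches 1.
print(Z(0.0), Z(1 - 1e-7), sqrt(pi / 2))
```

The value Z(0) = 1 reflects that the operator leaves a constant unchanged, and the limit at a = 1 gives the leading $\frac{1}{2}\sqrt{2\pi N}$ growth of the operator acting on $(1-z)^{-1}$.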

The recurrence relation in eq. (B18) is useful for
expressing $Z_1$, $Z_2$, and $Z_3$ in more convenient forms.
Insertion into eqs. (B11) - (B14) and factorization of the
obtained polynomials gives

$$Z_0(a) = Z(a) , \tag{B19}$$

$$Z_1(a) = \tfrac{1}{3}(a+1)\,Z(a-1) , \tag{B20}$$

$$Z_2(a) = \tfrac{1}{36}\,a(a+2)(2a-1)\,Z(a) , \tag{B21}$$

and

$$Z_3(a) = \tfrac{1}{1620}(a+1)(a+3)(10a^2 - 15a - 1)\,Z(a-1) . \tag{B22}$$

To express Z(a) in a more convenient form than 
eq. (|B17|) . we use the relations 



$$\partial_a \ln Z(a) = \tfrac{1}{2}(\ln 2 + \gamma) - \sum_{k=0}^{\infty} \frac{a}{(2k+1)(2k+1+a)} \tag{B30}$$

and

$$\partial_a^2 \ln Z(a) = -\sum_{k=0}^{\infty} \frac{1}{(2k+1+a)^2} . \tag{B31}$$



When the values and derivatives of Z(a) are calculated
for a = 0 and a = 1, one can use the recurrence relation,
eq. (B18), to calculate the corresponding properties for
any $a \in \mathbb{Z}$. See, e.g., on infinite sums that are useful
in those derivations.



APPENDIX C: STATISTICS FOR 
INFORMATION CONSERVING LOOPS 

Insertion of eq. (17) into eqs. (18) and (19) gives

$$P_N(\mu = k) = [w^k] \left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} (1 - rz)^{1-w} \tag{C1}$$



L \ (fc + i)V 



(B23) 



and 



%>4n> 



fc=i 



(B24) 



where $\gamma$ is the Euler-Mascheroni constant. See, e.g., [32] on
eqs. (B23) and (B24). We now get


fc=0 



2fc+ 1 



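(B25)

and

$$Z(a) = 2^{a/2}\,\frac{\Gamma(a/2)}{2\,\Gamma(a)} \tag{B26}$$

$$= 2^{a/2}\,\frac{\Gamma(a/2 + 1)}{\Gamma(a + 1)} . \tag{B27}$$

The first and second order derivatives of Z(a) can be
expressed according to

$$Z'(a) = Z\,\partial_a \ln Z(a) \tag{B28}$$

and

$$Z''(a) = Z\,\partial_a^2 \ln Z(a) + Z\,[\partial_a \ln Z(a)]^2 , \tag{B29}$$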



and

$$P_N(\hat\mu = k) = [w^k] \left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} \frac{1 - rz}{1 - rwz} . \tag{C2}$$



An alternative form of the probability generating function
in eq. (C1), for the special case r = 1, is presented
in |3fjj . However, this alternative expression is complicated
in comparison to eq. (C1), and it is much easier
to extract the probability distribution and corresponding
cumulants, along with their asymptotic expansions, from
eq. (C1). In |3fj|, general considerations for probability
generating functions are presented, along with several
examples of such functions.

For a power series of z with convergence radius larger 
than 1, we have the operator relation 



lim 

N— >oo 



N 



N 



(C3) 



which means the limit can be extracted by inserting z = 1
in the given function. In eqs. (C1) and (C2), w can be
regarded as an arbitrarily small number, which gives
arbitrarily large convergence radii in the corresponding power
expansions in z. Hence, the limiting probabilities for
large N are given by



$$P_\infty(\mu = k) = [w^k]\,(1 - r)^{1-w} \tag{C4}$$

$$= (1 - r)\,\frac{[-\ln(1 - r)]^k}{k!} \tag{C5}$$

and

$$P_\infty(\hat\mu = k) = [w^k]\,\frac{1 - r}{1 - rw} \tag{C6}$$

$$= (1 - r)\,r^k . \tag{C7}$$

Both limiting distributions are normalized for r < 1,
but not for r = 1. This means that the probability
distributions remain localized for subcritical networks as N
goes to infinity. For critical networks, the probabilities
approach zero, which means that the typical values of $\mu$
and $\hat\mu$ must diverge with N.

Note that eq. (C5) corresponds to a Poisson distribution
with intensity $\ln[1/(1 - r)]$, and that the probabilities
in eq. (C7) decay exponentially with rate r. For $\mu$, we
get



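$$\langle\mu\rangle_N = \left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} [-\ln(1 - rz)] \tag{C8}$$

$$= \sum_{k=1}^{N} \frac{N!\,r^k}{k\,(N-k)!\,N^k} \tag{C9}$$

and

$$\langle\mu^2\rangle_N = \left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} \ln(1 - rz)\,[\ln(1 - rz) - 1] \tag{C10}$$

$$= \sum_{k=1}^{N} \frac{N!\,r^k}{k\,(N-k)!\,N^k} \left(1 + 2\sum_{j=1}^{k-1}\frac{1}{j}\right) . \tag{C11}$$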



If r = 1, $\mu$ can be seen as the number of components in a
random map. For random maps, the result in eq. (C9) is
well-known and has been derived in several different ways
HI [H I2 pj 2l l Alternative derivations of eq. (|UTT]> are
found in [io, EjJ| .
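For small N, the random-map result can be confirmed by brute force. The sketch below assumes that eq. (C9) with r = 1 reads $\sum_{k=1}^{N} N!/[k\,(N-k)!\,N^k]$, and compares it with the average number of connected components over all $N^N$ maps of an N-element set into itself. All names are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def count_components(f):
    """Number of connected components of the functional graph i -> f[i]."""
    parent = list(range(len(f)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in enumerate(f):
        parent[find(i)] = find(j)
    return sum(1 for i in range(len(f)) if find(i) == i)


def mean_components_formula(N):
    """Assumed form of eq. (C9) at r = 1."""
    return sum(Fraction(factorial(N), k * factorial(N - k) * N ** k)
               for k in range(1, N + 1))


def mean_components_enumerated(N):
    """Exact average over all N**N maps."""
    total = sum(count_components(f) for f in product(range(N), repeat=N))
    return Fraction(total, N ** N)


for N in range(1, 5):
    print(N, mean_components_formula(N), mean_components_enumerated(N))
```

The enumeration grows as $N^N$, so it is only practical for very small N. Its purpose is to pin down the normalization of the formula, not to replace it.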

The distribution of $\mu$ can be calculated from eq. (C1).
To this end, we consider the series expansion

$$(1 - x)^{-w} = \sum_{n=0}^{\infty} \frac{x^n}{n!} \sum_{k=0}^{n} \left[{n \atop k}\right] w^k , \tag{C12}$$



For the number of nodes in information-conserving
loops, eq. (C2) yields

$$P_N(\hat\mu = k) = \frac{N!\,r^k}{(N-k)!\,N^k}\left(1 - r + \frac{kr}{N}\right) . \tag{C16}$$



For r = 1, eq. (C16) is consistent with the corresponding
results on the distribution of the number of invariant
elements in random maps |l9j| .

Also, eq. (C16) provides a simpler way to derive
eq. (C15). It is well known that the probability for a
random permutation of n elements to have k cycles is given by
$\frac{1}{n!}\left[{n \atop k}\right]$ (see, e.g., 0). Consider all nodes in
information-conserving loops of a Boolean network with in-degree 1.
We denote the set of such nodes by S. If we randomize
the network topology, under the constraint that S is
given, the network graph in S will also be the graph of a
random permutation of the elements in S. Then, every
cycle in this permutation corresponds to an
information-conserving loop in the network. In 0, [2^, the
corresponding observation for random maps was made.

When the network topology is randomized to fit with
a given S, only the size $\hat\mu$ of S matters. Thus,

$$P_N(\mu = k \mid \hat\mu = n) = \frac{1}{n!}\left[{n \atop k}\right] . \tag{C17}$$

Summation over all possible values of $\hat\mu$ gives

$$P_N(\mu = k) = \sum_{n=k}^{N} P_N(\mu = k \mid \hat\mu = n)\, P_N(\hat\mu = n) , \tag{C18}$$

which together with eqs. (C16) and (C17) provides a simpler
derivation of eq. (C15). An analogous derivation for
random maps is presented in |2lj .
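The mixture in eq. (C18) is straightforward to evaluate exactly. The sketch below assumes that eq. (C16) has the form $P_N(\hat\mu = n) = \frac{N!\,r^n}{(N-n)!\,N^n}\,(1 - r + nr/N)$ and that eq. (C9) is $\sum_k N!\,r^k/[k\,(N-k)!\,N^k]$. Both forms are our reading of the equations, and the function names are illustrative.

```python
from fractions import Fraction
from math import factorial


def stirling_cycle_table(n_max):
    """s[n][k]: sign-less Stirling numbers of the first kind."""
    s = [[0] * (n_max + 1) for _ in range(n_max + 1)]
    s[0][0] = 1
    for n in range(1, n_max + 1):
        for k in range(1, n + 1):
            s[n][k] = s[n - 1][k - 1] + (n - 1) * s[n - 1][k]
    return s


def p_mu_hat(N, r, n):
    """Assumed form of eq. (C16): probability of n nodes in loops."""
    r = Fraction(r)
    weight = Fraction(factorial(N), factorial(N - n) * N ** n) * r ** n
    return weight * (1 - r + n * r / N)


def p_mu(N, r, k):
    """Eq. (C18): mix the permutation cycle law (C17) over n."""
    s = stirling_cycle_table(N)
    return sum(Fraction(s[n][k], factorial(n)) * p_mu_hat(N, r, n)
               for n in range(k, N + 1))


N, r = 6, Fraction(3, 4)
dist = [p_mu(N, r, k) for k in range(N + 1)]
mean = sum(k * p for k, p in enumerate(dist))
print(sum(dist), mean)
```

With exact fractions, the distribution sums to 1 and its mean agrees with the finite sum for $\langle\mu\rangle_N$, which is a useful consistency check on the assumed form of eq. (C16).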

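For the first and second moments of $\hat\mu$, we find that

$$\langle\hat\mu\rangle_N = \left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} \frac{rz}{1 - rz} \tag{C19}$$

$$= \sum_{k=1}^{N} \frac{N!\,r^k}{(N-k)!\,N^k} \tag{C20}$$

and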



where [£] are the sign-less Stirling numbers (see, e.g. 
Insertion into eq. (|C1|) yields 



N 



n—k 







n — 


l 




Vi - 1" 








k- 


1 




k 


) 



TV" V n 

n=k V ' 



N 

E 



j .1 u 

TV 7 





n 




n — 1 




)( 


k 


— n 


k 


) 



n r 

TV 



(C13) 
(C14) 

(C15) 



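$$\langle\hat\mu^2\rangle_N = \left.\left(1 + \frac{1}{N}\partial_z\right)^{N}\right|_{z=0} \frac{rz\,(1 + rz)}{(1 - rz)^2} \tag{C21}$$

$$= \sum_{k=1}^{N} \frac{(2k - 1)\,N!\,r^k}{(N-k)!\,N^k} . \tag{C22}$$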

To better understand the results on $\langle\mu\rangle$, $\langle\mu^2\rangle$, $\langle\hat\mu\rangle$, and
$\langle\hat\mu^2\rangle$, we let r = 1 and calculate their asymptotes for
large N. For r = 1, $\mu$ corresponds to the number of
components in a random map, while $\hat\mu$ corresponds to
the number of elements in its invariant set.

From eq. (B15), we find the large-N asymptotes of
$(1 + \partial_z/N)^N|_{z=0}$ operating on $-\ln(1 - z)$, $[\ln(1 - z)]^2$,
$(1 - z)^{-1}$, and $(1 - z)^{-2}$. We also note that
$(1 + \partial_z/N)^N|_{z=0}\,1 = 1$ for all N. From these asymptotes,
combined with eqs. (C9), (C11), (C20), and (C22) for
r = 1, we get

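$$\langle\mu\rangle = \tfrac{1}{2}(\ln 2N + \gamma) + \tfrac{1}{6}\sqrt{2\pi}\,N^{-1/2} - \tfrac{1}{18}N^{-1}
  - \tfrac{1}{1080}\sqrt{2\pi}\,N^{-3/2} + O(N^{-2}) , \tag{C23}$$

$$\sigma^2(\mu) = \tfrac{1}{2}(\ln 2N + \gamma) - \tfrac{1}{8}\pi^2 + \tfrac{1}{6}(3 - 2\ln 2)\sqrt{2\pi}\,N^{-1/2}
  - \tfrac{1}{18}(\pi - 2)N^{-1} - \tfrac{1}{3240}(41 - 6\ln 2)\sqrt{2\pi}\,N^{-3/2}
  + O(N^{-2}) , \tag{C24}$$

$$\langle\hat\mu\rangle = \tfrac{1}{2}\sqrt{2\pi N} - \tfrac{1}{3} + \tfrac{1}{24}\sqrt{2\pi}\,N^{-1/2} - \tfrac{4}{135}N^{-1}
  + O(N^{-3/2}) , \tag{C25}$$

and

$$\sigma^2(\hat\mu) = \tfrac{1}{2}(4 - \pi)N - \tfrac{1}{6}\sqrt{2\pi N} - \tfrac{1}{36}(3\pi - 8)
  + \tfrac{17}{1080}\sqrt{2\pi}\,N^{-1/2} + O(N^{-1}) . \tag{C26}$$

Note that the potential term of order $N^{-1/2}\ln N$ in
eq. (C24) disappears due to cancellation when $\langle\mu\rangle^2$ is
subtracted from $\langle\mu^2\rangle$.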
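The expansions can be compared with the exact finite sums for r = 1. The sketch below assumes the sums $\langle\mu\rangle_N = \sum_k N!/[k\,(N-k)!\,N^k]$ and $\langle\hat\mu\rangle_N = \sum_k N!/[(N-k)!\,N^k]$, and uses the asymptotic terms through order $N^{-1}$ with the coefficients $\frac{1}{6}\sqrt{2\pi}$, $-\frac{1}{18}$, $-\frac{1}{3}$, $\frac{1}{24}\sqrt{2\pi}$, and $-\frac{4}{135}$. These coefficients are our reading of eqs. (C23) and (C25) and should be treated as an assumption.

```python
from math import log, pi, sqrt

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def exact_means(N):
    """Return (<mu>, <mu_hat>) for r = 1 from the finite sums."""
    term = 1.0  # running value of N! / ((N - k)! N**k)
    mean_mu = 0.0
    mean_mu_hat = 0.0
    for k in range(1, N + 1):
        term *= (N - k + 1) / N
        mean_mu += term / k
        mean_mu_hat += term
    return mean_mu, mean_mu_hat


def asymptotic_means(N):
    """Assumed expansions, truncated after the N**-1 terms."""
    s = sqrt(2 * pi)
    mu = 0.5 * (log(2 * N) + EULER_GAMMA) + s / (6 * sqrt(N)) - 1 / (18 * N)
    mu_hat = 0.5 * sqrt(2 * pi * N) - 1 / 3 + s / (24 * sqrt(N)) - 4 / (135 * N)
    return mu, mu_hat


for N in (10, 100, 1000):
    print(N, exact_means(N), asymptotic_means(N))
```

Even for N = 10 the truncated expansions should agree with the exact sums to several digits, which illustrates how quickly the series becomes useful.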



APPENDIX D: ASYMPTOTES RELATED TO 
BOOLEAN DYNAMICS 

We take a closer look at the case that r = 1 and $\Delta r < 1$.
Eq. (23) yields



Fl(1,z) 



1 - z Xl 
1 - (Arz) x ^ 
To the leading order in 1 — z, we get 



l/(2Ai.) 



(Dl) 



Ai(l-z) 



1 V(2Ax,) 



.1 - (Arz) A 
Insertion into eqs. I|36[) and (|B10|) gives 

l/(2Ai) 



[1 + 0(1-*)]. (D2) 









.1 - (Arz) x t_ 


[l + O^- 1 / 2 )] . 



(D3) 

To find the asymptote of (fJz,)jv, we apply eq. (|34(l and 
find that 

d w \ w=1 F L (w, z) = 

= | y> gcd(A,£) _ ;A 



A 



E 



gcd(fcA L ,i) 



fc=i 
1 - z 



[1 - (Ar) fcAi ] 

A, i l/(2Ai) 



2fcA, 



1 - (Arz)> 



(D4) 



Let $\varphi$ denote the Euler function. The Euler function is
defined for $n \in \mathbb{Z}_+$ in such a way that $\varphi(n)$ is the number
of values, $k \in \{1, 2, \ldots, n\}$, that satisfy $\gcd(k, n) = 1$.
If m divides n, $\varphi(n/m)$ is the number of values, $k \in
\{1, 2, \ldots, n\}$, that satisfy $\gcd(k, n) = m$. From summing
over every $m \in \mathbb{N}$ that divides n, we get



m|n 



which means that 



(D5) 



(D6) 
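The counting property of the Euler function that underlies this step can be verified directly. The sketch below checks, for small n, that the number of $k \in \{1, \ldots, n\}$ with $\gcd(k, n) = m$ equals $\varphi(n/m)$ for every divisor m of n, and that these counts add up to n. The function names are illustrative.

```python
from math import gcd


def euler_phi(n):
    """Number of k in {1, ..., n} with gcd(k, n) = 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def count_with_gcd(n, m):
    """Number of k in {1, ..., n} with gcd(k, n) = m."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == m)


n = 12
divisors_of_n = [m for m in range(1, n + 1) if n % m == 0]
print([(m, count_with_gcd(n, m), euler_phi(n // m)) for m in divisors_of_n])
print(sum(euler_phi(n // m) for m in divisors_of_n))
```

Grouping the values of k by their gcd with n is what allows a sum over k of a function of $\gcd(k, n)$ to be rewritten as a sum over the divisors of n.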



From eq. (|D6() . we see that 



gcd(A, £) ^ A 



A 



(D7) 



A=l l\L 

Similarly, we rewrite eq. (|D4|) and get 



On 



v F L (w,z) 



£#m — 

/> i _ 



E 



£|L/A £ 

1 - Z Xl 



2£ 



1 - (Arz) £X L 

V(2Ai) 



(D8) 



.1 - (Arz) A * 

Again, we perform an expansion around z = 1 and get 



-_iF L {w,z)= [—%L + \{h 



L/X L + h L/X L l n\ L ) 

+ s L - (ht - I^x/aJ ln ( 1 ~ z )] 



A, 



/> 

1 1/(2A £ ) 



.1 - (Arz) A ^ 



where 



E 



t\L 



^-E 



in i 



(D9) 

(D10) 
(Dll) 



and 



SL 



i 



2( 



e\L/\ 

For convenience, we define



A L = -h L + \{h L/ - XL +h 



1 - (Arz)* 



(D12) 



L/X, 



\n\ L )+s L (D13) 
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and 

B L = h L \h L/ ~ XL . (D14) 
Insertion of 

d w \ w =iF L (w, z) = [A L - B L ln(l - z)][l + 0(1 - z)\ 

(D15) 

into eq. (|37|l . combined with eqs. IB 15(1 and (|D3|I . gives 
(n L }%= exp (In 2) U £ + §fl £ In TV + g L ) 2 _\ L ' 



^2\ L 



x [1 + 0{N- 1 ' 2 )} 



(D16) 



This means that $\langle\Omega_L\rangle^G_N$ grows like a power law, $N^{u_L}$,
where the exponent is given by



UL = ^{h L -±h L/ - x J . 

Finally, we derive the asymptote of (CIl)n- 
eq. (|E4|) we get 

F L (2,z)= ^"^ [1 + 0(1 -z)] 
to the leading order in powers of 1 — z, where 
h l = E J t + E J i ' 

i\L 2i\L 

= E J i ln£ + E J <>~ ln£ ' 

e\L 

and 

^-n(™) J,+ n( r ' 



l\L 



2(\L 



1 



(Arf 



(D17) 
From 

(D18) 

(D19) 
(D20) 

(D21) 



The same procedure as for the other asymptotes lets us 
find the asymptote of eq. i|35|) . We obtain 

(Sl L ) N = SLe-^N^- 1 ^ 2 ^ + OiN- 1 )} . (D22) 
Hence, $\langle\Omega_L\rangle_N$ grows like a power law, $N^{U_L}$, where

$$U_L = \frac{H_L - 1}{2} . \tag{D23}$$

Note that $H_L$ is identical to the number of invariant sets
of L-cycle patterns, as defined in [13].



APPENDIX E: AN ALTERNATIVE EXPRESSION 

FOR F L (2,z) 



In 0|, we found that 



1 1 



x n(r7i 

2£\L v 



(Arf 



(El) 



where Jf are integers that can be calculated via the 
inclusion-exclusion principle. satisfies the relation 

2U~ = (-1) S 2^W , (E2) 

where s = YJjU Sj, d t ( B ) = Y[%MY 3 , and d\, . . . , df 
are the odd prime divisors to £. Furthermore, 



From eq. l)El|l. we can expect that 



(E3) 



where we use the convention that , 2 =0 for odd 



F L {2,z)=(l-rz)l[l- 



1 



- (rz)* 1 - (Arz) e 



Jt 



2£|L 



(r«)< 1 + (Arz)< 



(E4) 



This is indeed true, and to see that, we rewrite eq. (|E4I 
via the power series expansion 



1 °° 1 

1-x e-f 



(E5) 



fc=i 



and get 



F £ (2, *) = (!- rz) exp ]T ]T ^-g-[r M + (Ar) w ] 



<|i fe=i 



2£|Lfc=l 



(E6) 



A change of the summation order, with A = k£, yields 

00 ej+ 
F L (2,z) = (l-rz)expJ2 E -^[r A + (Ar) V 

A=l%cd(A,L) 

oo „ 

xexpE E ^qr A + (-l)^(Ar)V 



A=l ^|gcd(A,L) 
2|L« 



(E7) 



Eq. i|E7() is consistent with eq. I|34(l . provided that 



gcd(A,L) £|gcd(A,L) 
2\L/t 



and 



E + E (-i) A/ ^< _ = 2SCd(A,L) 1 1 f Ai { A 



f|gcd(A,L) £|gcd(A,L) 
2\L/l 



(E9) 
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The sum of cqs. HE8f) and HE9(1 is given by 
2£J+ + 213 i = 2 gcd(A ' L) 

£\gcd(X,L) 2£|gcd(A,L) 

which is equivalent to 

5^ {2£J+ + £J- /2 ) = 2^ x ^ 

£|gcd(A,L) 



(E10) 



(Ell) 



and 



£(2^--^ 2 )=2* 



cd(A,L) 



f|gcd(A,L) 



(E12) 



Eq. (|E 1 2p is true as a consequence of eq. (|E2|> . and hence 
eq. I|E10|I is true. 

The difference between eqs. (|E8|) and (|E9|) is 



£ 2^-=2^){l * **J* . (E13) 

£|gcd(A,L) L 1 L 1 

2|L/* 
2fA/£ 

If Ax | A, the sum in eq. 1|K13[) is empty and therefore 
equal to the right hand side. If \l \ A, eq. (|E13I) is 
equivalent to 



J2 2£J~ = 2 scd( - x ^ 



<|gcd(A,i) 
2£fgcd(A,L) 



(E14) 



consistent with eq. (E2). Hence, eq. (E13) holds, and this
result concludes the verification of eqs. (E8) and (E9).
Thus, we conclude that eq. (E4) is correct.
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